Building ancient Spanish dictionaries for spell-checking of DL texts
Abstract
Aware of how useful spell-checkers are for correcting modern works, and lacking such a facility for ancient texts, we decided to build dictionaries for ancient Spanish. This decision raised new problems and new questions. To help solve the problem of spell-checking ancient Spanish, we have built a time-aware system of dictionaries that takes the temporal dynamics of language into account. In this paper we present the problems we found, the decisions we made, and the results and conclusions we reached.
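The abstract does not describe how the time-aware dictionaries are organised, so the following is only a minimal sketch of one way such a lookup could work, not the authors' design. It assumes each word form carries one or more year ranges in which it counts as correctly spelled. The class name `TimeAwareDictionary`, its methods, and the year ranges in the usage lines are all hypothetical placeholders, not historical data.

```python
from dataclasses import dataclass, field


@dataclass
class TimeAwareDictionary:
    """Hypothetical store mapping each word form to the year ranges in which it is accepted."""

    # word -> list of (first_year, last_year) inclusive ranges
    entries: dict[str, list[tuple[int, int]]] = field(default_factory=dict)

    def add(self, word: str, first_year: int, last_year: int) -> None:
        """Register a word form as valid for an inclusive range of years."""
        self.entries.setdefault(word.lower(), []).append((first_year, last_year))

    def is_valid(self, word: str, year: int) -> bool:
        """True if the word form is accepted for a text dated to the given year."""
        ranges = self.entries.get(word.lower(), [])
        return any(first <= year <= last for first, last in ranges)

    def check(self, text: str, year: int) -> list[str]:
        """Return the whitespace-separated tokens not accepted for the given year."""
        return [token for token in text.split() if not self.is_valid(token, year)]


# Usage with invented year ranges (placeholders only, not attested dates).
dictionary = TimeAwareDictionary()
dictionary.add("fermoso", 1200, 1500)
dictionary.add("hermoso", 1450, 2100)

print(dictionary.check("fermoso hermoso", 1300))
print(dictionary.check("fermoso hermoso", 1800))
```

The point of the sketch is that the same token can be flagged or accepted depending on the date assigned to the text being checked, which is what distinguishes this from a single static word list.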
Similar resources
Spell Checking in Spanish: The Case of Diacritic Accents
This article presents the problem of diacritic restoration (or diacritization) in the context of spell-checking, with the focus on an orthographically rich language such as Spanish. We argue that despite the large volume of work published on the topic of diacritization, currently available spell-checking tools have still not found a proper solution to the problem in those cases where both forms...
Creating and Weighting Hunspell Dictionaries as Finite-State Automata
There are numerous formats for writing spell-checkers for open-source systems and there are many lexical descriptions for natural languages written in these formats. In this paper, we demonstrate a method for converting Hunspell and related spell-checking lexicons into finite-state automata. We also present a simple way to apply unigram corpus training in order to improve the spellchecking sugg...
Rule-Based Spanish Morphological Analyzer Built From Spell Checking Lexicon
Preprocessing tools for automated text analysis have become more widely available in major languages, but non-English tools are often still limited in their functionality. When working with Spanish-language text, researchers can easily find tools for tokenization and stemming, but may not have the means to extract more complex word features like verb tense or mood. Yet Spanish is a morphological...
Compiling Apertium morphological dictionaries with HFST and using them in HFST applications
In this paper we aim to improve the interoperability and reusability of the morphological dictionaries of the Apertium machine translation system by formulating a generic finite-state compilation formula, implemented in the HFST finite-state system, that compiles Apertium dictionaries into general-purpose finite-state automata. We demonstrate the use of the resulting automaton in FST-based spell-checki...
Using Google to Create a More Accurate and Easily-Extensible Spell Corrector
Spell checkers are now a common, integrated part of many commercial and freely available word processing programs. Agglutinative languages (such as Hungarian and Finnish) pose a separate problem, as there are many different "correct" forms for any given word. Due to the seemingly infinite number of possible words, the limited scope of a dictionary (provided with most spell-checking software) ...